Here we show scatterplots comparing expression levels for all genes between samples: i) among all controls, ii) among all treatment samples and iii) among all samples together. These plots are only produced when the number of samples to compare within a group is 10 or fewer.
Replicates within the same group tend to have Pearson correlation coefficients >= 0.96. Lower values may indicate problems with the samples.
Correlation coefficients tend to be slightly higher between replicates from the same group than between replicates from different groups. If this is not the case, it may indicate mislabelling or other potential issues.
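As a rough illustration of the check described above, the following sketch computes a Pearson correlation coefficient between two replicates and flags the pair if it falls below the 0.96 guideline. The function name `pearson` and the expression values are hypothetical and are not taken from the report's own code or data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)

# Hypothetical per-gene expression values for two replicates (illustration only)
replicate_a = [5.9, 4.9, 2.0, 2.3, 2.1]
replicate_b = [5.9, 4.6, 2.6, 2.2, 2.1]

r = pearson(replicate_a, replicate_b)
needs_review = r < 0.96  # guideline quoted in this report
print(f"r = {r:.3f}, needs review: {needs_review}")
```

In practice the coefficient would be computed over all genes rather than a handful, so a single pair of short vectors like this one only demonstrates the calculation.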
BROWN: higher correlation; YELLOW: lower correlation
This section explores the relationship between supplementary categorical variables (-S option), samples and PCs.
| factor | dimension | correlation | p.value |
|---|---|---|---|
| ENSG00000131174 | PC1 | 0.998945795076717 | 1.66643623613467e-06 |
| ENSG00000196743 | PC1 | 0.997356412738594 | 1.04735929873508e-05 |
| ENSG00000142168 | PC1 | 0.997212894283342 | 1.16411123532835e-05 |
| ENSG00000075223 | PC1 | 0.996642788667184 | 1.68873825570051e-05 |
| ENSG00000106615 | PC1 | 0.996554700266559 | 1.77846873704754e-05 |
| ENSG00000232344 | PC1 | 0.995881316348151 | 2.54103987776388e-05 |
| ENSG00000163110 | PC1 | 0.995591191875218 | 2.91135353213767e-05 |
| ENSG00000134551 | PC1 | 0.994314071044831 | 4.84027696857566e-05 |
| ENSG00000172209 | PC1 | 0.994283241823095 | 4.89285704665584e-05 |
| ENSG00000241764 | PC1 | 0.99409943847548 | 5.21222206341813e-05 |
| ENSG00000121749 | PC2 | 0.991304081743919 | 0.000113099703178063 |
| ENSG00000164961 | PC2 | 0.990608784060551 | 0.000131878276383805 |
| ENSG00000138028 | PC2 | 0.989408419008535 | 0.000167678291661149 |
| ENSG00000180182 | PC2 | 0.989374906155233 | 0.000168739181478416 |
| ENSG00000115966 | PC2 | 0.986564389145977 | 0.000269560789591654 |
| ENSG00000143344 | PC2 | 0.986388458924706 | 0.000276650142992706 |
| ENSG00000029725 | PC2 | 0.986386152431107 | 0.000276743694660941 |
| ENSG00000163291 | PC2 | 0.985052643688155 | 0.000333465396032014 |
| ENSG00000118965 | PC2 | 0.984038531386474 | 0.000380119480899824 |
| ENSG00000268886 | PC2 | 0.981623330058124 | 0.000503450078149581 |
These boxplots show the distributions of count data before and after normalization (shown for normalization method: default).
Samples are ranked by total expressed genes. The union of expressed genes is the cumulative total of expressed genes (the number of genes expressed in any sample up to the current sample, expected to increase with sample rank). The intersection of expressed genes is the cumulative intersection of expressed genes (the number of genes expressed in every sample up to the current sample, expected to decrease with sample rank).
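The cumulative union and intersection described above can be sketched as follows. This is a minimal illustration only: the function name, the toy gene sets and the ascending ranking direction are assumptions, since the report does not state how samples are ordered.

```python
def cumulative_union_intersection(expressed):
    """Return (sample, union_size, intersection_size) rows.

    `expressed` maps a sample name to the set of gene IDs expressed in it.
    Samples are ranked by number of expressed genes (ascending: an assumption).
    """
    ranked = sorted(expressed, key=lambda s: len(expressed[s]))
    union = set()
    intersection = None  # becomes a set after the first sample
    rows = []
    for sample in ranked:
        genes = set(expressed[sample])
        union |= genes
        intersection = genes if intersection is None else intersection & genes
        rows.append((sample, len(union), len(intersection)))
    return rows

# Hypothetical gene sets, for illustration only
toy = {
    "s1": {"g1", "g2"},
    "s2": {"g1", "g2", "g3"},
    "s3": {"g2", "g3", "g4", "g5"},
}
rows = cumulative_union_intersection(toy)
for row in rows:
    print(row)
```

The union size can only grow and the intersection size can only shrink as samples are added, which matches the expected shape of the two curves.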
This plot shows the distribution of mean counts per gene, classified by filter.
The variance of gene counts across samples is shown. Genes with variance below the selected threshold (dashed grey line) were filtered out.
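A variance filter of this kind might look like the sketch below. The threshold value, the gene names and the use of the sample variance are assumptions made for illustration; how the report derives its threshold (for example from the `count_var_quantile` option) is not shown here.

```python
from statistics import variance

def filter_by_variance(counts, threshold):
    """Keep genes whose sample variance across samples is >= threshold.

    `counts` maps a gene ID to its list of counts, one per sample.
    """
    return {
        gene: values
        for gene, values in counts.items()
        if variance(values) >= threshold
    }

# Hypothetical counts for three genes across three samples
toy_counts = {
    "gA": [10, 12, 11],  # variance 1
    "gB": [5, 5, 5],     # variance 0, removed by the filter
    "gC": [0, 20, 40],   # variance 400
}
kept = filter_by_variance(toy_counts, threshold=0.5)
print(sorted(kept))
```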
All counts were normalized by the default algorithm (see options below). These counts were then scaled by log10 and plotted in a heatmap.
| rownames | ENSG00000276168 | ENSG00000198804 | ENSG00000275395 | ENSG00000198886 | ENSG00000198786 |
|---|---|---|---|---|---|
| control_1 | 5.921 | 4.922 | 1.952 | 2.282 | 2.13 |
| control_2 | 5.918 | 4.622 | 2.597 | 2.173 | 2.07 |
| control_4 | 5.328 | 4.478 | 2.489 | 2.121 | 1.904 |
| gtdup_1 | 7.97 | 2.444 | 3.054 | 1.126 | 0.984 |
| gtdup_2 | 9.063 | 3.977 | 4.255 | 2.172 | 1.856 |
| gtdup_3 | 2.3 | 1.766 | 4.133 | 0.78 | 0.71 |
| Sample Names: |
|---|
| control_1 |
| control_2 |
| control_4 |
| Sample Names: |
|---|
| gtdup_1 |
| gtdup_2 |
| gtdup_3 |
DEgenes Hunter uses multiple DE detection packages to analyse all genes in the input count table and labels them accordingly.
Note: A positive log fold change shows higher expression in the treatment group; a negative log fold change represents higher expression in the control group.
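To make the sign convention concrete, here is a small sketch that computes a log2 fold change as treatment over control. It uses a plain ratio of group means, which is a simplification: the DE packages estimate fold changes from statistical models, so this is not their method. The function name and values are hypothetical.

```python
from math import log2

def log2_fold_change(treatment, control):
    """log2(mean treatment / mean control); both means must be positive."""
    mean_t = sum(treatment) / len(treatment)
    mean_c = sum(control) / len(control)
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("means must be positive to take a log ratio")
    return log2(mean_t / mean_c)

# Higher in treatment gives a positive value
up = log2_fold_change(treatment=[8, 8, 8], control=[2, 2, 2])
# Higher in control gives a negative value
down = log2_fold_change(treatment=[2, 2, 2], control=[8, 8, 8])
print(up, down)
```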
This barplot shows the total number of genes passing each stage of analysis - from the total number of genes in the input table of counts, to the genes surviving the expression filter, to the genes detected as DE by one package, to the genes detected by at least 2 packages.
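The last two bars of this plot can be thought of as counts of genes by how many packages support them. The sketch below illustrates that tally under stated assumptions: the function name and gene IDs are hypothetical, and the default of 2 mirrors the `minpack_common` value listed in the options table.

```python
from collections import Counter

def count_by_support(calls, min_packages=2):
    """Return (genes called by >= 1 package, genes called by >= min_packages).

    `calls` maps a package name to the collection of genes it reports as DE.
    """
    # Each gene is counted once per package, however it was supplied
    support = Counter(gene for genes in calls.values() for gene in set(genes))
    at_least_one = set(support)
    common = {gene for gene, n in support.items() if n >= min_packages}
    return at_least_one, common

# Hypothetical DE calls from two packages
toy_calls = {
    "DESeq2": {"g1", "g2", "g3"},
    "edgeR": {"g2", "g3", "g4"},
}
any_pkg, common = count_by_support(toy_calls, min_packages=2)
print(len(any_pkg), len(common))
```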
This is the Venn diagram of all possible DE genes (DEGs) according to at least one of the selected DE detection packages.
This graph shows the logFC calculated (y-axis) by each package (points) for each gene (x-axis). Only genes with variability over 0.01 are plotted. This representation allows the user to observe the behaviour of each DE package and see whether any of them gives atypical results.
If no genes show sufficient variance in estimated logFC across methods, no plot is produced and a warning message is given.
Benchmark of false positive calling: Boxplot of FDR values among all genes with an FDR <= 0.05 in at least one DE detection package
The red horizontal line represents the FDR threshold, which has been set to 0.05
The black lines represent other values.
DEgenes Hunter differential expression analysis results can be found in file Common_results/hunter_results_table.txt
Various plots specific to each package are shown below:
The effective library size is the factor used by the DESeq2 normalization algorithm for each sample. The effective library size should depend on the raw library size.
This plot compares the effective library size with the raw library size.
This is the MA plot from DESeq2 package
In DESeq2, the MA-plot (log ratio versus abundance) shows the log2 fold changes attributable to a given variable over the mean of normalized counts. Points are colored red if the adjusted p-value is less than 0.1. Points which fall outside the window are plotted as open triangles pointing either up or down. A table containing the DESeq2 DEGs is provided in Results\_DESeq2/DEgenes\_DESEq2.txt. A table containing the DESeq2 normalized counts is provided in Results\_DESeq2/Normalized\_counts\_DESEq2.txt.
Counts of prevalent DEGs were normalized by the DESeq2 algorithm. These counts were then scaled by log10 and plotted in a heatmap.
This is the MA plot from package edgeR
Differential gene expression data can be visualized as MA-plots (log ratio versus abundance) where each dot represents a gene. The differentially expressed genes are colored red and the non-differentially expressed ones are colored black. A table containing the edgeR DEGs is provided in Results\_edgeR/DEgenes\_edgeR.txt. A table containing the edgeR normalized counts is provided in Results\_edgeR/Normalized\_counts\_edgeR.txt.
This is an advanced section that allows comparing the output of packages unadjusted for DE analysis. The data shown here do not necessarily reflect biological impact.
Distributions of p-values, unadjusted and adjusted for multiple testing (FDR)
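For readers unfamiliar with FDR adjustment, the sketch below implements the Benjamini-Hochberg procedure, a common way of turning unadjusted p-values into FDR-adjusted ones. It is offered as an illustration of the general technique; the report does not state which adjustment each package applied in this run, and the p-values here are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that adjusted values
    # never decrease as the raw p-value increases
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical unadjusted p-values
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(p, 4) for p in adj])
```

Adjusted values are never smaller than the raw ones, which is why the adjusted distribution is shifted towards 1 relative to the unadjusted one.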
The first column contains the option names; the second contains the value given for each option in this run.
| opt | value |
|---|---|
| input_file | /Users/marmtnez/Desktop/Master_Bioinfo/TFM/Files/final_counts.txt |
| pseudocounts | FALSE |
| reads | 2 |
| count_var_quantile | 0 |
| minlibraries | 2 |
| filter_type | separate |
| output_files | /Users/marmtnez/Desktop/Master_Bioinfo/TFM/Results/degenes/ctrl_vs_gtdup_degenes |
| p_val_cutoff | 0.05 |
| lfc | 1.5 |
| modules | DE |
| minpack_common | 2 |
| target_file | /Users/marmtnez/Desktop/Master_Bioinfo/TFM/Files/ctrl_vs_gtdup_target.txt |
| model_variables | |
| numerics_as_factors | FALSE |
| string_factors | |
| numeric_factors | |
| WGCNA_memory | 5000 |
| WGCNA_norm_method | DESeq2 |
| WGCNA_deepsplit | 2 |
| WGCNA_min_genes_cluster | 20 |
| WGCNA_detectcutHeight | 0.995 |
| WGCNA_mergecutHeight | 0.25 |
| WGCNA_all | FALSE |
| WGCNA_blockwiseNetworkType | signed |
| WGCNA_blockwiseTOMType | signed |
| WGCNA_minCoreKME | 0.7 |
| WGCNA_minKMEtoStay | 0.5 |
| WGCNA_corType | pearson |
| multifactorial | |
| help | FALSE |